INTRODUCTION


Polynomial regression is a methodology used to fit curvilinear models to a set of observations.
These curvilinear models fit into the framework of the general linear model and, hence,
can usually be fit to the data using any general multiple regression program. Two such programs
are currently available through the Mathematical Statistics Staff, viz. GEMREG (GEneral
Multiple REGression)* and DA-MRCA (DAhlgren Multiple Regression and Correlation
Analysis).** Both provide least squares estimates of the regression coefficients, analysis of
variance tables, and a variety of user-controlled options. These programs rest on the assumptions
that the error terms (differences between the observed and predicted values of the dependent
variable) have zero expectation, the same variance for all observations, and zero correlation.
When either of the last two assumptions is not met, the usual least squares method is not
applicable; instead, a weighted least squares procedure is required.

Programs WEPOR and WEPOR2 (WEighted Polynomial Regression) use this weighted least
squares procedure to estimate regression coefficients for models with one independent variable.
Program WEPOR handles the case in which the error terms have different variances but are
uncorrelated, whereas WEPOR2 deals with the problem of different variances and correlated
error terms. Output for both programs includes ANOVA (ANalysis Of VAriance) tables, predicted
values of the dependent variable and the associated residuals, and confidence limits for
selected synthetic points. The values for bounds on the entire curve generated from the input
data are written on output files for use with DISSPLA (Display Integrated Software System and
Plotting LAnguage).† Program LIMITS is an example of a program that uses the output from
WEPOR together with DISSPLA features to plot sample points, the regression curve, and
confidence and prediction limits.


*Taub, A. E., and M. A. Thomas, GEMREG - A General Multiple Regression Program, NSWC TN 81-298 (Dahlgren, Va., 1981).

**Abt, K., G. Gemmill, T. Herring, and R. Shade, DA-MRCA: A Fortran IV Program for Multiple Linear Regression, NSWC
TR-2035 (Dahlgren, Va., 1966).

†Integrated Software Systems Corporation (ISSCO), Display Integrated Software System and Plotting Language, ISSCO (San
Diego, Calif., 1970).
THE MODEL


The polynomial regression model with a single variable has the form

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_k x_i^k + \epsilon_i, \qquad i = 1, 2, \ldots, n. \tag{1}$$

In this model, $x_i$ is the value of the independent variable associated with the ith response value
($y_i$), n is the number of observations, k is the order of the polynomial, $\beta_j$ is the jth regression
coefficient, and $\epsilon_i$ is the ith random error. The inclusion of $\epsilon_i$ in the model accounts for the
fact that the response variable y is a random variable and, hence, the relationship between the
response variable and the independent variable is not an exact functional relationship.


Polynomial models fit into the framework of the general linear model

$$y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \epsilon_i, \qquad i = 1, 2, \ldots, n, \tag{2}$$

and, hence, can usually be fit to data using any general multiple regression program provided
that the $\epsilon_i$ can be assumed to have zero expectation, the same variance $\sigma^2$ for all i, and be
uncorrelated. These assumptions can be expressed in a more compact form if the model is
written in matrix notation:


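$$y = X\beta + \epsilon. \tag{3}$$

In the general context of model (2), y is an n x 1 vector of observations, $\beta$ is a (k + 1) x 1
vector of regression coefficients, $\epsilon$ is an n x 1 vector of random errors, and

$$X_{n \times (k+1)} = \begin{bmatrix} 1 & X_{11} & X_{21} & \cdots & X_{k1} \\ 1 & X_{12} & X_{22} & \cdots & X_{k2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{1n} & X_{2n} & \cdots & X_{kn} \end{bmatrix}.$$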
Since model (1) is a special case of model (2), the appropriate X for (1) is obtained by
letting $X_{ji} = x_i^j$. In this notation, the expectation of the $\epsilon_i$ and their variance-covariance
matrix can be denoted by $E(\epsilon)$ and $\mathrm{Var}(\epsilon)$, respectively. Hence, if the $\epsilon_i$ are assumed to have
zero expectation, this is denoted by $E(\epsilon) = 0$. Also, if the $\epsilon_i$ are assumed to be uncorrelated
with the same variance, this is denoted by $\mathrm{Var}(\epsilon) = \sigma^2 I$, where I is the n x n identity matrix.
In regression applications, the assumption $E(\epsilon) = 0$ does not present any difficulty. However,
the assumption $\mathrm{Var}(\epsilon) = \sigma^2 I$ cannot always be met and, hence, poses a serious problem if not
handled properly. When it is not met, the variance-covariance matrix is denoted by $\mathrm{Var}(\epsilon) = \sigma^2 V$,
where V is an n x n positive definite matrix.
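The construction of X for the polynomial model can be sketched in a few lines of Python. This is only an illustration; the function name `design_matrix` is hypothetical and is not taken from the WEPOR source.

```python
def design_matrix(x, k):
    """Return the n x (k + 1) matrix whose ith row is (1, x_i, x_i**2, ..., x_i**k).

    Hypothetical helper: column j holds x_i**j, i.e. X_ji = x_i**j in the
    notation of the text.
    """
    return [[xi ** j for j in range(k + 1)] for xi in x]


# Three levels of the independent variable, second degree polynomial.
X = design_matrix([0.0, 10.0, 30.0], 2)
for row in X:
    print(row)
```

Each row carries a leading 1 for the intercept followed by successive powers of the same x value, which is what makes the polynomial model a special case of the general linear model.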

An example of a regression application where $\mathrm{Var}(\epsilon) \neq \sigma^2 I$ involves regressing projectile
seating distances (y) on gun barrel life (x), expressed in percent expended. Here, the variation
in seating distance increases with the percent expended barrel life. Hence, the assumption of
equal variances does not hold, and the usual least squares regression is not applicable. Cases of
this kind, and more complicated situations where the errors are correlated, can be handled by a
modified least squares procedure known as weighted least squares. This procedure is discussed
by Draper and Smith,* and much of the development that follows is based on their discussion.

When the aforementioned assumptions are satisfied, the usual least squares procedure
provides a vector of estimates of the regression coefficients that has the form

$$\hat{\beta} = b = (X'X)^{-1} X'y. \tag{4}$$

The weighted least squares procedure amounts to transforming the dependent or response
variable y to another variable that does satisfy the assumptions. The usual (unweighted) least
squares analysis is then applied to the new variable, and the estimates so obtained are reexpressed
in terms of the original variable y. This process is examined in detail in the ensuing
paragraphs.
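As a small illustration of Equation 4, the sketch below forms the normal equations $(X'X)b = X'y$ and solves them by Gaussian elimination. The helper names `solve` and `ols` are hypothetical, and the data are made up so that the answer is known in advance.

```python
def solve(a, rhs):
    """Solve a * b = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    b = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * b[c] for c in range(r + 1, n))
        b[r] = (m[r][n] - s) / m[r][r]
    return b


def ols(X, y):
    """Unweighted least squares estimate b = (X'X)^-1 X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Illustrative data lying exactly on the line y = 1 + 2x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
b = ols(X, y)
print(b)
```

Solving the normal equations directly avoids forming the inverse explicitly; the inverse itself is only needed later, for confidence and prediction limits.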

Consider the original model (Equation 3) with assumptions $E(\epsilon) = 0$ and $\mathrm{Var}(\epsilon) = \sigma^2 V$
(instead of $\sigma^2 I$). Since V is positive definite, it is possible to find an upper triangular matrix P such
that $P'P = V$. (Draper and Smith indicate that it is possible to find a unique nonsingular
symmetric matrix P such that $P'P = PP = P^2 = V$. We have not found this to be the case, nor is
such a requirement necessary in what follows.)
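One way to obtain an upper triangular P with $P'P = V$ is a Cholesky-type factorization. The sketch below is an assumption about how such a factor could be computed, not a transcription of the routine used in WEPOR2; the name `upper_cholesky` is hypothetical.

```python
import math


def upper_cholesky(V):
    """Return an upper triangular P with P'P = V for symmetric positive definite V."""
    n = len(V)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal entry: what is left of V[i][i] after the rows above.
        s = V[i][i] - sum(P[k][i] ** 2 for k in range(i))
        if s <= 0.0:
            raise ValueError("V is not positive definite")
        P[i][i] = math.sqrt(s)
        # Remaining entries of row i.
        for j in range(i + 1, n):
            P[i][j] = (V[i][j] - sum(P[k][i] * P[k][j] for k in range(i))) / P[i][i]
    return P


V = [[4.0, 2.0], [2.0, 3.0]]
P = upper_cholesky(V)
# Rebuild P'P to confirm that it reproduces V.
PtP = [[sum(P[k][i] * P[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(P)
print(PtP)
```

When V is diagonal the factor reduces to the square roots of the diagonal elements, which is the case handled by WEPOR.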


*Draper, N. R., and H. Smith, Applied Regression Analysis (New York, N.Y.: John Wiley & Sons, Inc., 1966), pp. 77-81.
If the model in Equation 3 is premultiplied by $(P')^{-1}$, a new model is generated in the
form

$$(P')^{-1} y = (P')^{-1} X\beta + (P')^{-1}\epsilon. \tag{5}$$

Since

$$E[(P')^{-1}\epsilon] = (P')^{-1} E(\epsilon) = (P')^{-1}\, 0 = 0$$

and

$$\begin{aligned}
\mathrm{Var}[(P')^{-1}\epsilon] &= E[(P')^{-1}\epsilon\epsilon' P^{-1}] = (P')^{-1} E(\epsilon\epsilon') P^{-1} \\
&= (P')^{-1} V P^{-1} \sigma^2 \\
&= (P')^{-1} (P'P) P^{-1} \sigma^2 \\
&= I\sigma^2,
\end{aligned}$$

the new model meets the assumptions required for the ordinary least squares procedure. This
new model can be written in matrix notation as

$$z = Q\beta + f, \tag{6}$$

where $z = (P')^{-1} y$, $Q = (P')^{-1} X$, and $f = (P')^{-1}\epsilon$.
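Because P' is lower triangular, z and Q can be obtained by forward substitution rather than by inverting P'. The following sketch assumes P is already available as an upper triangular list of lists; the names `forward_solve` and `transform` are hypothetical.

```python
def forward_solve(L, rhs):
    """Solve L t = rhs for a lower triangular matrix L by forward substitution."""
    t = []
    for i, row in enumerate(L):
        s = sum(row[k] * t[k] for k in range(i))
        t.append((rhs[i] - s) / row[i])
    return t


def transform(P, X, y):
    """Return (Q, z) with Q = (P')^-1 X and z = (P')^-1 y for upper triangular P."""
    Pt = [list(col) for col in zip(*P)]                   # P' is lower triangular
    z = forward_solve(Pt, y)
    cols = [forward_solve(Pt, list(c)) for c in zip(*X)]  # one solve per column of X
    Q = [list(r) for r in zip(*cols)]
    return Q, z


# Illustrative case: uncorrelated errors with V = diag(4, 9), so P = diag(2, 3).
P = [[2.0, 0.0], [0.0, 3.0]]
X = [[1.0, 10.0], [1.0, 30.0]]
y = [8.0, 12.0]
Q, z = transform(P, X, y)
print(Q)
print(z)
```

Note that the first column of Q is no longer a column of ones, which is why the transformed model has no explicit intercept column.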
THE ANALYSIS


The  error  term  f  in  the  revised  model  in  Equation  6  satisfies  the  assumptions  for  the  usual 
least  squares  analysis.  Therefore,  the  usual  analysis  will  be  applied  to  the  revised  model. 
Estimation  of  the  regression  coefficients  will  be  dealt  with  first.  These  estimates  are  obtained 
by  writing  the  solution  vector  in  Equation  4  in  terms  of  the  new  parameters  in  Equation  6. 
This  provides 


$$b = (Q'Q)^{-1} Q'z. \tag{7}$$

Reexpressing Q and z in terms of the original model parameters provides

$$\begin{aligned}
b &= [(X'P^{-1})((P')^{-1}X)]^{-1} (X'P^{-1})((P')^{-1}y) \\
&= [X'(P'P)^{-1}X]^{-1} X'(P'P)^{-1} y \\
&= (X'WX)^{-1} X'Wy.
\end{aligned} \tag{8}$$

In this expression, W is the inverse of V; i.e., $W = (P'P)^{-1} = V^{-1}$.
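For the uncorrelated case, where W is diagonal, Equation 8 can be evaluated without forming W as a full matrix. The sketch below is a minimal illustration under that assumption; `solve` and `wls` are hypothetical helper names, and the data are invented.

```python
def solve(a, rhs):
    """Solve a * b = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    b = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * b[c] for c in range(r + 1, n))
        b[r] = (m[r][n] - s) / m[r][r]
    return b


def wls(X, y, w):
    """Weighted estimate b = (X'WX)^-1 X'Wy for W = diag(w)."""
    p = len(X[0])
    xtwx = [[sum(wi * r[i] * r[j] for r, wi in zip(X, w)) for j in range(p)]
            for i in range(p)]
    xtwy = [sum(wi * r[i] * yi for r, yi, wi in zip(X, y, w)) for i in range(p)]
    return solve(xtwx, xtwy)


X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y_exact = [1.0, 3.0, 5.0, 7.0]        # exactly y = 1 + 2x
y_noisy = [1.0, 3.0, 5.0, 9.0]        # last point pulled off the line
print(wls(X, y_exact, [4.0, 1.0, 0.5, 0.25]))
print(wls(X, y_noisy, [1.0, 1.0, 1.0, 1.0]))
print(wls(X, y_noisy, [1.0, 1.0, 1.0, 0.01]))
```

Data that lie exactly on a line are recovered whatever the weights; for the perturbed data, giving the last observation a small weight pulls the fit back toward the line defined by the other three points.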

This solution has the same form as Equation 4, except for the insertion of W, the weighting
matrix. The new model has an implied zero intercept, since the Q matrix does not have a
leading column of ones. Hence, the entries in the analysis of variance table for the new model
are computed in a slightly different manner from those obtained when ordinary least squares
procedures are used. Table 1 shows the breakdown of the degrees of freedom and the formulae
needed to compute the sums of squares for a first degree polynomial.
Table 1. Analysis of Variance Table for First Degree Polynomial

Source                  Sums of Squares                                                 Degrees of Freedom

$\beta_0$               $\left(\sum_{i=1}^{n} (q_0)_i z_i\right)^2 / \sum_{i=1}^{n} (q_0)_i^2$     1
$\beta_1 | \beta_0$     $b'X'Wy - SS(\beta_0)$                                          1
Error                   $y'Wy - b'X'Wy$                                                 $n - 2$
Total                   $y'Wy = z'z$                                                    $n$

In this table, $(q_0)_i$ is the ith element of the first column in the Q matrix [$Q = (P')^{-1} X$].

When  several  observations  are  taken  at  the  same  level  of  the  independent  variable,  the 
error  sum  of  squares  in  Table  1  can  be  broken  into  components  for  lack  of  fit  and  pure  error. 
From  Equations  5  and  6,  we  have 

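$$z = (P')^{-1} y,$$

where $z' = (z_1, z_2, \ldots, z_n)$. A change in the subscript of the z's produces

$$z' = (z_{11}, z_{12}, \ldots, z_{1n_1}, z_{21}, \ldots, z_{2n_2}, \ldots, z_{k1}, z_{k2}, \ldots, z_{kn_k}),$$

where the first $n_1$ values are associated with the first level of the independent variable, the next
$n_2$ values are associated with the second level, and so on. (Here k denotes the number of distinct
levels of the independent variable, not the order of the polynomial.) With this notation, the sum
of squares for pure error is computed by

$$SS(pe) = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (z_{ij} - \bar{z}_i)^2 = \sum_{i=1}^{k} \left[ \sum_{j=1}^{n_i} z_{ij}^2 - \frac{1}{n_i} \left( \sum_{j=1}^{n_i} z_{ij} \right)^2 \right] \tag{9}$$

with degrees of freedom $\nu = \sum_{i=1}^{k} n_i - k$.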
The sum of squares for lack of fit can then be obtained by subtraction; i.e.,

$$SS(lf) = SSE - SS(pe),$$

where SSE is the error sum of squares from Table 1.
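The pure error computation amounts to grouping the transformed responses by level of the independent variable and summing squared deviations about each group mean. A minimal sketch follows; `pure_error` is a hypothetical name, and the numbers, including the value used for SSE, are invented for illustration.

```python
from collections import defaultdict


def pure_error(x, z):
    """Return (SS(pe), degrees of freedom) for transformed responses z grouped by x."""
    groups = defaultdict(list)
    for xi, zi in zip(x, z):
        groups[xi].append(zi)
    ss = 0.0
    for vals in groups.values():
        mean = sum(vals) / len(vals)
        ss += sum((v - mean) ** 2 for v in vals)
    df = len(z) - len(groups)      # sum of the n_i minus the number of levels
    return ss, df


x = [0, 0, 10, 10, 10]
z = [1.0, 3.0, 2.0, 4.0, 6.0]
ss_pe, df_pe = pure_error(x, z)

sse = 14.5                         # assumed error sum of squares, for illustration only
ss_lf = sse - ss_pe                # lack of fit by subtraction
print(ss_pe, df_pe, ss_lf)
```

Levels observed only once contribute nothing to the pure error sum of squares or to its degrees of freedom, so replication is needed for the breakdown to be useful.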

Confidence limits on the expected value of y and prediction limits on the mean of m future
observations of y differ only slightly in weighted regression from unweighted regression. The value
of the independent variable X used in forming the limits is referred to as the synthetic point.
Letting $x^*$ denote the synthetic point associated with X and $(x^*)' = (1, x^*, (x^*)^2, \ldots, (x^*)^k)$ for
a kth degree polynomial,

$$(x^*)'b = b_0 + b_1 x^* + b_2 (x^*)^2 + \cdots + b_k (x^*)^k$$

is a point estimate of the expected value of y and of a single future observation when $X = x^*$.
In the unweighted case, the 100(1 - $\alpha$) percent confidence limits on E(y) when $X = x^*$ are

$$(x^*)'b \pm t_{\nu,\,1-\alpha/2} \left[ s^2 \left( (x^*)'(X'X)^{-1} x^* \right) \right]^{1/2}. \tag{10}$$

In the weighted case, $X'X$ is replaced with $X'WX$, yielding

$$(x^*)'b \pm t_{\nu,\,1-\alpha/2} \left[ s^2 \left( (x^*)'(X'WX)^{-1} x^* \right) \right]^{1/2}. \tag{11}$$

In these expressions, $t_{\nu,\,1-\alpha/2}$ is the 100(1 - $\alpha$/2) percentage point for a t distribution with $\nu$
degrees of freedom, where $\nu$ is associated with the error mean square in the analysis of variance
table. This error mean square is denoted by $s^2$ above and is obtained by dividing the error sum
of squares (SSE) by $\nu$, the associated degrees of freedom; i.e., $s^2 = SSE/\nu$.

The prediction limits for the mean of m future observations at $X = x^*$ are, in the
unweighted case, given by

$$(x^*)'b \pm t_{\nu,\,1-\alpha/2} \left[ s^2 \left( \frac{1}{m} + (x^*)'(X'X)^{-1} x^* \right) \right]^{1/2}. \tag{12}$$

For the weighted case, $X'X$ is changed as above, yielding

$$(x^*)'b \pm t_{\nu,\,1-\alpha/2} \left[ s^2 \left( \frac{1}{m} + (x^*)'(X'WX)^{-1} x^* \right) \right]^{1/2}. \tag{13}$$
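Equations 11 and 13 differ only in the 1/m term, so both can be handled by one routine. The sketch below assumes the caller supplies the inverse matrix $(X'WX)^{-1}$, the error mean square, and the critical t value (the Python standard library has no t quantile function); the names `quad_form` and `limits` and all numbers are hypothetical.

```python
import math


def quad_form(v, A):
    """Return v' A v for a vector v and a square matrix A."""
    n = len(v)
    return sum(v[i] * A[i][j] * v[j] for i in range(n) for j in range(n))


def limits(xstar, b, inv_xtwx, s2, t_crit, m=None):
    """Limits at the synthetic point xstar.

    m is None : confidence limits on E(y), as in Equation 11.
    m = 1, 2, ...: prediction limits for the mean of m future observations,
                   as in Equation 13.
    """
    v = [xstar ** j for j in range(len(b))]          # (1, x*, ..., (x*)^k)
    fit = sum(bj * vj for bj, vj in zip(b, v))
    extra = 0.0 if m is None else 1.0 / m
    half = t_crit * math.sqrt(s2 * (extra + quad_form(v, inv_xtwx)))
    return fit - half, fit + half


# Invented inputs chosen so the arithmetic is easy to follow.
b = [1.0, 2.0]
inv_xtwx = [[1.0, 0.0], [0.0, 0.25]]
conf = limits(2.0, b, inv_xtwx, s2=2.0, t_crit=3.0)
pred = limits(2.0, b, inv_xtwx, s2=2.0, t_crit=3.0, m=1)
print(conf)
print(pred)
```

The prediction limits are always wider than the confidence limits at the same point, and they approach the confidence limits as m grows.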
PROGRAM ORGANIZATION


Program  WEPOR  is  actually  the  main  driving  routine  that  calls  a  series  of  first  level 
subroutines  to  perform  various  tasks. 

First,  subroutine  IOP  is  called  to  read  in  parameters  for  user-specified  input  and  output 
options,  viz.,  a  title  for  the  execution,  the  number  of  observations,  the  desired  degree  of  the 
polynomial  model  to  be  fit  to  the  data,  and  a  parameter  specifying  whether  or  not  confidence 
and  prediction  limits  are  requested.  The  printing  of  these  limits  requires  the  further  input 
of  the  number  of  future  observations  on  which  the  prediction  limits  are  based,  and  the  number 
of  synthetic  points  to  be  read  in  if  the  levels  of  x  to  be  used  for  the  limits  are  different  from 
those  in  the  original  data.  The  validity  of  each  parameter  is  checked  and,  should  inconsistencies 
be  detected,  either  a  default  value  is  substituted  or  an  error  message  printed  and  execution 
halted. 

Subroutine READIN is called to read in the raw input data and the array of weights
corresponding to the diagonal elements of the matrix $W = V^{-1}$ (recall that program WEPOR
assumes uncorrelated error terms; should correlations exist, program WEPOR2 should be used,
and the entire matrix V is read in at this point). The required matrix P, where $(P'P)^{-1} = W$, is
computed at this time. If V is a diagonal matrix, the elements of $P^{-1}$ are calculated by simply
taking the square root of the corresponding elements of W. In the case where V is nondiagonal,
P is obtained by performing a matrix decomposition on V using the square root
method. The total sum of squares and the sum of squares due to pure error are also computed. The
raw data points are saved on TAPE10.

For  each  stage  of  development  in  the  model,  subroutine  REGRESS  is  called  to  compute 
and  print  a  set  of  regression  coefficients,  the  additional  sum  of  squares  to  be  included  in  the 
regression  sum  of  squares,  and  the  residual  sum  of  squares  and  F  statistic  for  the  current  model. 

Subroutine  TABLE  prints  two  analysis  of  variance  tables:  one  shows  the  breakdown  of 
the  residual  sum  of  squares  into  components  of  pure  error  and  lack  of  fit  and  the  other  table 
shows  the  contribution  made  by  each  term  in  the  model  to  the  overall  regression  sum  of 
squares.  This  subroutine  also  prints  the  raw  input  data,  estimated  values  for  the  dependent 
variable,  and  residuals. 

If the user has requested confidence and prediction limits, subroutine SYNTH performs
the necessary calculations. If the user has specified that a new set of levels for x is to be used
instead of those from the original data set, these new levels are read in from the input file.
The confidence and prediction limits are printed and are also saved on TAPE11 and TAPE12,
respectively.
Some of these first level subroutines reference routines found in the NSWC/DL Library of
Mathematics Subroutines:* CROUT, which inverts general real matrices, and MPROD and
TMPROD, which perform matrix multiplication operations. Routine QSORT from the User's
Guide for the CDC 6700 Computing System** is used to arrange the levels of the independent
variable in ascending order. Subroutine FINDT, used to estimate the critical t value for confidence
and prediction limits, is adapted from a similar routine in program GEMREG.†


EXAMPLE 


For  the  application  of  projectile  seating  distance  (psd)  expressed  as  a  function  of  percent 
gun  barrel  life  expended,  the  following  independent  and  uncorrelated  pairs  of  data  points  were 
used  to  derive  a  first  degree  polynomial. 


Percent Barrel Life     psd
Expended                (m)

 0                      39.82
 0                      39.71
10                      41.13
10                      41.10
30                      43.52
30                      43.90
60                      48.05
60                      46.31
75                      47.23
75                      48.68

Although the weight associated with each level of the independent variable is the inverse
of the variance for the response variable at that level, these variances are unknown. Estimates
based on the above data and data from previous experiments were used to construct the following
weights:


*Morrison, Alfred H., Jr., NSWC/DL Library of Mathematics Subroutines, NSWC TR 81-410 (Dahlgren, Va., 1981).

**User's Guide for the CDC 6700 Computing System, NSWC TR-3228 (Dahlgren, Va., 1974).

†Taub, A. E., and M. A. Thomas, 1981.
Value of Independent
Variable                Weight

 0                      12.50
10                      10.00
30                       5.00
60                       1.40
75                       1.25

Appendix  A  provides  the  input  guide  for  execution  of  programs  WEPOR  and  WEPOR2. 
The  actual  cards  used  for  this  example  are  shown  in  Appendix  B. 

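Computation of the estimates for the regression parameters requires Equation 8:

$$b = (X'WX)^{-1} X'Wy,$$

where

$$X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 10 \\ 1 & 10 \\ 1 & 30 \\ 1 & 30 \\ 1 & 60 \\ 1 & 60 \\ 1 & 75 \\ 1 & 75 \end{bmatrix}, \qquad
W = \mathrm{diag}(12.5,\ 12.5,\ 10.0,\ 10.0,\ 5.0,\ 5.0,\ 1.4,\ 1.4,\ 1.25,\ 1.25), \qquad \text{and} \qquad
y = \begin{bmatrix} 39.82 \\ 39.71 \\ 41.13 \\ 41.10 \\ 43.52 \\ 43.90 \\ 48.05 \\ 46.31 \\ 47.23 \\ 48.68 \end{bmatrix}.$$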
The following matrix operations are required before proceeding:

$$(X'WX) = \begin{bmatrix} 60.30 & 855.50 \\ 855.50 & 35142.50 \end{bmatrix}$$

$$(X'WX)^{-1} = \begin{bmatrix} 2.5 \times 10^{-2} & -6.2 \times 10^{-4} \\ -6.2 \times 10^{-4} & 4.35 \times 10^{-5} \end{bmatrix} \ *$$

$$(X'Wy) = \begin{bmatrix} 2505.52 \\ 38253.80 \end{bmatrix}$$
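These matrix products can be checked directly from the example data. Because W is diagonal and the model is first degree, each entry reduces to a weighted sum, and the 2 x 2 inverse has a closed form. The variable names below are illustrative choices, not names from WEPOR.

```python
# Example data: percent barrel life expended, seating distance, and weights.
x = [0, 0, 10, 10, 30, 30, 60, 60, 75, 75]
y = [39.82, 39.71, 41.13, 41.10, 43.52, 43.90, 48.05, 46.31, 47.23, 48.68]
w = [12.5, 12.5, 10.0, 10.0, 5.0, 5.0, 1.4, 1.4, 1.25, 1.25]

# Entries of X'WX and X'Wy for the model y = b0 + b1 * x.
s_w = sum(w)
s_wx = sum(wi * xi for wi, xi in zip(w, x))
s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
s_wy = sum(wi * yi for wi, yi in zip(w, y))
s_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

xtwx = [[s_w, s_wx], [s_wx, s_wxx]]
xtwy = [s_wy, s_wxy]

# Closed-form inverse of a symmetric 2 x 2 matrix.
det = s_w * s_wxx - s_wx ** 2
inv = [[s_wxx / det, -s_wx / det], [-s_wx / det, s_w / det]]

print(xtwx)
print(inv)
print(xtwy)
```

The inverse of a symmetric matrix is itself symmetric, so its two off-diagonal entries must agree.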

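Recall that there exists a matrix P such that $(P'P)^{-1} = W$. Since W is diagonal here, $(P')^{-1}$ is the
diagonal matrix of the square roots of the weights, and the vector $z = (P')^{-1} y$ is then
computed as

$$z = \mathrm{diag}(\sqrt{12.5},\ \sqrt{12.5},\ \sqrt{10.0},\ \sqrt{10.0},\ \sqrt{5.0},\ \sqrt{5.0},\ \sqrt{1.4},\ \sqrt{1.4},\ \sqrt{1.25},\ \sqrt{1.25})
\begin{bmatrix} 39.82 \\ 39.71 \\ 41.13 \\ 41.10 \\ 43.52 \\ 43.90 \\ 48.05 \\ 46.31 \\ 47.23 \\ 48.68 \end{bmatrix}
= \begin{bmatrix} 140.78 \\ 140.40 \\ 130.06 \\ 129.97 \\ 97.31 \\ 98.16 \\ 56.85 \\ 54.79 \\ 52.80 \\ 54.43 \end{bmatrix}.$$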

*For the sake of clarity, rounded values will be given for the results of matrix operations.




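Similarly, the vector $q_0$, which is equal to the first column in the Q matrix ($= (P')^{-1} X$), is
found to be

$$q_0' = (\sqrt{12.5},\ \sqrt{12.5},\ \sqrt{10.0},\ \sqrt{10.0},\ \sqrt{5.0},\ \sqrt{5.0},\ \sqrt{1.4},\ \sqrt{1.4},\ \sqrt{1.25},\ \sqrt{1.25}).$$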


The vector of estimates for the regression parameters, vector b, is therefore equal to

$$b = (X'WX)^{-1} X'Wy = \begin{bmatrix} 39.88 \\ 0.12 \end{bmatrix}.$$

This result is found on the first page of the printout (Appendix B, page B-4). For the analysis
of variance tables printed on pages two and three of the printout (pages B-4, B-5), the following
operations are performed:


$$SSR = b'(X'Wy) = 104424.89$$

$$SS(\beta_0) = \left( \sum_{i=1}^{10} (q_0)_i z_i \right)^2 \Big/ \sum_{i=1}^{10} (q_0)_i^2 = 104106.35$$

$$SS(\beta_1 | \beta_0) = SSR - SS(\beta_0) = 318.54$$

$$SS\ \mathrm{Total} = y'Wy = z'z = 104431.63$$

$$SSE = SS\ \mathrm{Total} - SSR = 6.74$$
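The sums of squares above can be reproduced from the example data. The sketch below uses the fact that, for diagonal W, $(q_0)_i = \sqrt{w_i}$ and $z_i = \sqrt{w_i}\, y_i$, so the sums involving $q_0$ and z reduce to weighted sums of the original observations. Variable names are illustrative.

```python
x = [0, 0, 10, 10, 30, 30, 60, 60, 75, 75]
y = [39.82, 39.71, 41.13, 41.10, 43.52, 43.90, 48.05, 46.31, 47.23, 48.68]
w = [12.5, 12.5, 10.0, 10.0, 5.0, 5.0, 1.4, 1.4, 1.25, 1.25]

s_w = sum(w)
s_wx = sum(wi * xi for wi, xi in zip(w, x))
s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
s_wy = sum(wi * yi for wi, yi in zip(w, y))
s_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

# b = (X'WX)^-1 X'Wy for the first degree model, written out for the 2 x 2 case.
det = s_w * s_wxx - s_wx ** 2
b0 = (s_wxx * s_wy - s_wx * s_wxy) / det
b1 = (s_w * s_wxy - s_wx * s_wy) / det

ssr = b0 * s_wy + b1 * s_wxy                          # b'X'Wy
ss_b0 = s_wy ** 2 / s_w                               # (sum q0_i z_i)^2 / sum q0_i^2
ss_b1 = ssr - ss_b0                                   # SS(b1 | b0)
ss_total = sum(wi * yi * yi for wi, yi in zip(w, y))  # y'Wy = z'z
sse = ss_total - ssr

for name, value in [("b0", b0), ("b1", b1), ("SSR", ssr), ("SS(b0)", ss_b0),
                    ("SS(b1|b0)", ss_b1), ("SS Total", ss_total), ("SSE", sse)]:
    print(f"{name:10s} {value:12.2f}")
```

Almost all of the regression sum of squares is attributable to $\beta_0$ because the total sum of squares is not corrected for the mean.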
Since each level of the independent variable has two observations associated with it, Equation 9
for computing the sum of squares due to pure error can be written as

$$SS(pe) = \sum_{i=1}^{5} \sum_{j=1}^{2} (z_{ij} - \bar{z}_i)^2$$

and is found to be 3.87. Finally,

$$SS(lf) = SSE - SS(pe) = 2.87.$$

Page two of the printout (page B-4) shows the analysis of variance table, including a
breakdown of the error sum of squares into the sums of squares due to pure error and lack of
fit. The associated degrees of freedom and mean squares are printed for the regression and error
terms.


The  F  statistic  to  test  the  lack  of  fit  component  [F  =  MS(lf)/MS(pe)]  is  1.23.  This  result 
is  less  than  5.41,  the  critical  F  value  for  a  =  0.05  with  degrees  of  freedom  3  and  5,  and 
indicates  that  the  first  degree  polynomial  model  chosen  is  not  inadequate  at  the  0.05  level. 
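The lack of fit test can be traced from the raw data as follows. This is a sketch for the first degree model with diagonal weights; the critical value 5.41 is the one quoted in the text rather than computed, and the variable names are illustrative.

```python
import math
from collections import defaultdict

x = [0, 0, 10, 10, 30, 30, 60, 60, 75, 75]
y = [39.82, 39.71, 41.13, 41.10, 43.52, 43.90, 48.05, 46.31, 47.23, 48.68]
w = [12.5, 12.5, 10.0, 10.0, 5.0, 5.0, 1.4, 1.4, 1.25, 1.25]

# Transformed responses z = (P')^-1 y for diagonal W.
z = [math.sqrt(wi) * yi for wi, yi in zip(w, y)]

# Pure error: squared deviations of z about the mean of z at each level of x.
groups = defaultdict(list)
for xi, zi in zip(x, z):
    groups[xi].append(zi)
ss_pe = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups.values())
df_pe = len(z) - len(groups)

# Error sum of squares from the weighted straight-line fit.
s_w = sum(w)
s_wx = sum(wi * xi for wi, xi in zip(w, x))
s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
s_wy = sum(wi * yi for wi, yi in zip(w, y))
s_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
det = s_w * s_wxx - s_wx ** 2
b0 = (s_wxx * s_wy - s_wx * s_wxy) / det
b1 = (s_w * s_wxy - s_wx * s_wy) / det
sse = sum(zi * zi for zi in z) - (b0 * s_wy + b1 * s_wxy)

# Lack of fit by subtraction, then the F ratio of the two mean squares.
ss_lf = sse - ss_pe
df_lf = (len(z) - 2) - df_pe
f_stat = (ss_lf / df_lf) / (ss_pe / df_pe)

F_CRIT = 5.41   # critical value for alpha = 0.05 with 3 and 5 degrees of freedom, from the text
print(f"SS(pe) = {ss_pe:.2f} on {df_pe} df, SS(lf) = {ss_lf:.2f} on {df_lf} df")
print(f"F = {f_stat:.2f}, lack of fit significant: {f_stat > F_CRIT}")
```

The error degrees of freedom (n - 2 = 8) split into 5 for pure error, one per replicated level, and 3 for lack of fit.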

On the third page of the printout (page B-5) is an analysis of variance table that shows
the contribution made by each term in the model. The sum of squares due to regression is
determined for polynomials of degree from 0 ($\beta_0$ only in the model) to the full model chosen. At
each stage, the additional sum of squares is computed and stored for use in this table.

The sum of squares represented by X**0 is that associated with the regression model
having only $\beta_0$ in it [$SS(\beta_0)$]. In our example, this value is 104106.35. The nth sum of squares
listed, X**n, represents the additional sum of squares obtained by adding $\beta_n$ to the model
that already contains $\beta_0, \beta_1, \ldots, \beta_{n-1}$ and can be computed as follows:

$$SS(\beta_n | \beta_0, \beta_1, \ldots, \beta_{n-1}) = SS(\beta_0, \beta_1, \ldots, \beta_n) - SS(\beta_0, \beta_1, \ldots, \beta_{n-1}).$$

In this example,

$$SS(\beta_1 | \beta_0) = SS(\beta_0, \beta_1) - SS(\beta_0)$$

or 318.54 = 104424.89 - 104106.35.

The  column  with  the  heading  “F  Statistics  -  MSR/MSE”  shows  the  values  of  the  F  test  for 
the  model  at  each  stage  of  development. 
The fourth page of the printout lists the case numbers, values $x_i$ of the independent
variable, observed values $y_i$ of the dependent variable, the estimated values $\hat{y}_i$ for the dependent
variable based on the full regression equation, and the residuals $y_i - \hat{y}_i$ (page B-5).

In this example, the error terms were presumed to be uncorrelated, which implies a
diagonal covariance matrix V. Therefore, only the array of weights read in as part of the input,
representing the diagonal elements of $W = V^{-1}$, is printed with the associated cases. If
the error terms had not been assumed to be uncorrelated, the lower triangular portion of W
would also have been printed.

The minimum and maximum absolute residuals (min $|y_i - \hat{y}_i|$ and max $|y_i - \hat{y}_i|$) are also
provided.

The user has the option of requesting confidence limits at the 100$\gamma$ percent level, where
$1 - \gamma$ is specified by the user. These limits may be placed about the estimated values $\hat{y}_i$ for the
original levels of X or for up to 100 other synthetic points.

At the same time, 100$\gamma$ percent prediction limits, based on the predicted mean of m new
observations at the same levels of X as used for the confidence limits, may be requested. The
value of m is also user provided. Pages five and six of the printout show 95-percent confidence
and prediction limits using the original input values for the levels of X. The prediction limits
are based on the predicted value of a single future observation at each level of X.
APPENDIX A

INPUT GUIDE FOR WEPOR AND WEPOR2


Card No.          Variable    Description                                      Columns   Format

1                 TITLE       Title for run                                    1-80      8A10

2                 NOBS        Number of observations                           1-5       I5
                                NOBS > 0  data on cards
                                NOBS < 0  data on TAPE8 (must be
                                          attached prior to execution)
                                WEPOR:   2 < |NOBS| < 750
                                WEPOR2:  2 < |NOBS| < 100

                  KMAX        Desired degree of polynomial model               6-10      I5

                  COPT        Confidence/prediction limit option               11-15     I5
                                COPT = 0  no intervals
                                     = 1  confidence intervals only
                                     = 2  confidence and prediction intervals
                                Default: 0

3                 NPTS        Number of synthetic points for                   1-5       I5
(used only if                 confidence/prediction limits
COPT = 1,2)                     NPTS = 0  use original x_i values
                                NPTS < 100

                  AR          AR = (1 - γ) for 100γ percent limits             6-10      F5.2
                                0 < AR < 1.0
                                Default: 0.05

                  M           Number of future observations                    10-15     I5
                              prediction limits based on
                                Default: 1

4                 FORM1       Format used to read in (x,y) pairs               1-80      8A10

5                 X           Independent variable level                                 FORM1
                  Y           Dependent variable observation                             FORM1
                              (Repeat Card 5 as needed)

6                 FORM2       Format used to read in "weights"                 1-80      8A10

7                 WEPOR: W    Array of weights (diagonal                                 FORM2
                              elements of W = V^-1)
                  WEPOR2: V   Covariance matrix                                          FORM2
                              (Repeat Card 7 as needed)

8                 XPTS        Synthetic points - levels of                               FORM1
(used only if                 independent variable
COPT = 1,2
and NPTS > 0)

APPENDIX B

SAMPLE INPUT AND OUTPUT FOR WEPOR AND WEPOR2

Sample Input Deck
APPENDIX C

SAMPLE PLOTS PRODUCED BY PROGRAM LIMITS


Program LIMITS uses the graphics package DISSPLA (Reference 3 in text) to plot the
confidence and prediction limits generated by programs WEPOR and WEPOR2. Local files
produced by these two programs and used as input for LIMITS are TAPE10 (raw data), TAPE11
(confidence limits), and TAPE12 (prediction limits). Figures C-1 and C-2 were drawn using the
results of the example discussed in the text.


Figure C-1. 95-Percent Confidence Limits: Projectile Seating Distance Expressed as a Function of
Percent Gun Barrel Life Expended


Figure C-2. 95-Percent Prediction Limits: Projectile Seating Distance Expressed as a Function
of Percent Gun Barrel Life Expended